Hybrid unmanned aerial vehicles (UAVs) integrate the efficient forward flight of fixed-wing and vertical takeoff and landing (VTOL) capabilities of multicopter UAVs. This paper presents the modeling, control and simulation of a new type of hybrid micro-small UAVs, coined as lifting-wing quadcopters. The airframe orientation of the lifting wing needs to tilt a specific angle often within $ 45$ degrees, neither nearly $ 90$ nor approximately $ 0$ degrees. Compared with some convertiplane and tail-sitter UAVs, the lifting-wing quadcopter has a highly reliable structure, robust wind resistance, low cruise speed and reliable transition flight, making it potential to work fully-autonomous outdoor or some confined airspace indoor. In the modeling part, forces and moments generated by both lifting wing and rotors are considered. Based on the established model, a unified controller for the full flight phase is designed. The controller has the capability of uniformly treating the hovering and forward flight, and enables a continuous transition between two modes, depending on the velocity command. What is more, by taking rotor thrust and aerodynamic force under consideration simultaneously, a control allocation based on optimization is utilized to realize cooperative control for energy saving. Finally, comprehensive Hardware-In-the-Loop (HIL) simulations are performed to verify the advantages of the designed aircraft and the proposed controller.
translated by 谷歌翻译
In this paper, we propose an effective unified control law for accurately tracking agile trajectories for lifting-wing quadcopters with different installation angles, which have the capability of vertical takeoff and landing (VTOL) as well as high-speed cruise flight. First, we derive a differential flatness transform for the lifting-wing dynamics with a nonlinear model under coordinated turn condition. To increase the tracking performance on agile trajectories, the proposed controller incorporates the state and input variables calculated from differential flatness as feedforward. In particular, the jerk, the 3-order derivative of the trajectory, is converted into angular velocity as a feedforward item, which significantly improves the system bandwidth. At the same time, feedback and feedforward outputs are combined to deal with external disturbances and model mismatch. The control algorithm has been thoroughly evaluated in the outdoor flight tests, which show that it can achieve accurate trajectory tracking.
translated by 谷歌翻译
As the quality of optical sensors improves, there is a need for processing large-scale images. In particular, the ability of devices to capture ultra-high definition (UHD) images and video places new demands on the image processing pipeline. In this paper, we consider the task of low-light image enhancement (LLIE) and introduce a large-scale database consisting of images at 4K and 8K resolution. We conduct systematic benchmarking studies and provide a comparison of current LLIE algorithms. As a second contribution, we introduce LLFormer, a transformer-based low-light enhancement method. The core components of LLFormer are the axis-based multi-head self-attention and cross-layer attention fusion block, which significantly reduces the linear complexity. Extensive experiments on the new dataset and existing public datasets show that LLFormer outperforms state-of-the-art methods. We also show that employing existing LLIE methods trained on our benchmark as a pre-processing step significantly improves the performance of downstream tasks, e.g., face detection in low-light conditions. The source code and pre-trained models are available at https://github.com/TaoWangzj/LLFormer.
translated by 谷歌翻译
Image restoration under hazy weather condition, which is called single image dehazing, has been of significant interest for various computer vision applications. In recent years, deep learning-based methods have achieved success. However, existing image dehazing methods typically neglect the hierarchy of features in the neural network and fail to exploit their relationships fully. To this end, we propose an effective image dehazing method named Hierarchical Contrastive Dehazing (HCD), which is based on feature fusion and contrastive learning strategies. HCD consists of a hierarchical dehazing network (HDN) and a novel hierarchical contrastive loss (HCL). Specifically, the core design in the HDN is a Hierarchical Interaction Module, which utilizes multi-scale activation to revise the feature responses hierarchically. To cooperate with the training of HDN, we propose HCL which performs contrastive learning on hierarchically paired exemplars, facilitating haze removal. Extensive experiments on public datasets, RESIDE, HazeRD, and DENSE-HAZE, demonstrate that HCD quantitatively outperforms the state-of-the-art methods in terms of PSNR, SSIM and achieves better visual quality.
translated by 谷歌翻译
Pre-trained language models for programming languages have shown a powerful ability on processing many Software Engineering (SE) tasks, e.g., program synthesis, code completion, and code search. However, it remains to be seen what is behind their success. Recent studies have examined how pre-trained models can effectively learn syntax information based on Abstract Syntax Trees. In this paper, we figure out what role the self-attention mechanism plays in understanding code syntax and semantics based on AST and static analysis. We focus on a well-known representative code model, CodeBERT, and study how it can learn code syntax and semantics by the self-attention mechanism and Masked Language Modelling (MLM) at the token level. We propose a group of probing tasks to analyze CodeBERT. Based on AST and static analysis, we establish the relationships among the code tokens. First, Our results show that CodeBERT can acquire syntax and semantics knowledge through self-attention and MLM. Second, we demonstrate that the self-attention mechanism pays more attention to dependence-relationship tokens than to other tokens. Different attention heads play different roles in learning code semantics; we show that some of them are weak at encoding code semantics. Different layers have different competencies to represent different code properties. Deep CodeBERT layers can encode the semantic information that requires some complex inference in the code context. More importantly, we show that our analysis is helpful and leverage our conclusions to improve CodeBERT. We show an alternative approach for pre-training models, which makes fully use of the current pre-training strategy, i.e, MLM, to learn code syntax and semantics, instead of combining features from different code data formats, e.g., data-flow, running-time states, and program outputs.
translated by 谷歌翻译
Recent deep learning methods have achieved promising results in image shadow removal. However, their restored images still suffer from unsatisfactory boundary artifacts, due to the lack of degradation prior embedding and the deficiency in modeling capacity. Our work addresses these issues by proposing a unified diffusion framework that integrates both the image and degradation priors for highly effective shadow removal. In detail, we first propose a shadow degradation model, which inspires us to build a novel unrolling diffusion model, dubbed ShandowDiffusion. It remarkably improves the model's capacity in shadow removal via progressively refining the desired output with both degradation prior and diffusive generative prior, which by nature can serve as a new strong baseline for image restoration. Furthermore, ShadowDiffusion progressively refines the estimated shadow mask as an auxiliary task of the diffusion generator, which leads to more accurate and robust shadow-free image generation. We conduct extensive experiments on three popular public datasets, including ISTD, ISTD+, and SRD, to validate our method's effectiveness. Compared to the state-of-the-art methods, our model achieves a significant improvement in terms of PSNR, increasing from 31.69dB to 34.73dB over SRD dataset.
translated by 谷歌翻译
Face Restoration (FR) aims to restore High-Quality (HQ) faces from Low-Quality (LQ) input images, which is a domain-specific image restoration problem in the low-level computer vision area. The early face restoration methods mainly use statistic priors and degradation models, which are difficult to meet the requirements of real-world applications in practice. In recent years, face restoration has witnessed great progress after stepping into the deep learning era. However, there are few works to study deep learning-based face restoration methods systematically. Thus, this paper comprehensively surveys recent advances in deep learning techniques for face restoration. Specifically, we first summarize different problem formulations and analyze the characteristic of the face image. Second, we discuss the challenges of face restoration. Concerning these challenges, we present a comprehensive review of existing FR methods, including prior based methods and deep learning-based methods. Then, we explore developed techniques in the task of FR covering network architectures, loss functions, and benchmark datasets. We also conduct a systematic benchmark evaluation on representative methods. Finally, we discuss future directions, including network designs, metrics, benchmark datasets, applications,etc. We also provide an open-source repository for all the discussed methods, which is available at https://github.com/TaoWangzj/Awesome-Face-Restoration.
translated by 谷歌翻译
当使用深度学习技术对程序语言进行建模时,广泛采用了带有树或图形结构的神经网络,以捕获程序抽象语法树(AST)中的丰富结构信息。但是,计划中广泛存在长期/全球依赖性,大多数这些神经体系结构无法捕获这些依赖性。在本文中,我们提出了Tree-Transformer,这是一种新型的递归树结构神经网络,旨在克服上述局限性。树转化器利用两个多头注意单元来建模兄弟姐妹和父子节点对之间的依赖关系。此外,我们提出了一个双向传播策略,以允许节点信息向两个方向传递:沿树木的自下而上和自上而下。通过结合自下而上和自上而下的传播,树转化器可以同时学习全局上下文和有意义的节点特征。广泛的实验结果表明,我们的树转换器在具有树级和节点级别的预测任务中,在与程序相关的任务中优于现有基于树或基于图的神经网络,这表明Tree-Transformer在学习两个树级时都表现良好和节点级表示。
translated by 谷歌翻译
摆脱拟合配对训练数据的基本限制,最近无监督的低光增强方法在调整图像的照明和对比度方面表现出色。但是,对于无监督的低光增强,由于缺乏对详细信号的监督而导致的剩余噪声抑制问题在很大程度上阻碍了这些方法在现实世界应用中的广泛部署。在本文中,我们提出了一种新型的自行车相互作用生成对抗网络(CIGAN),以实现无监督的低光图像增强,它不仅能够更好地在低/正常光图像之间更好地传输照明分布,还可以操纵两个域之间的详细信号,例如。 ,在环状增强/降解过程中抑制/合成逼真的噪声。特别是,提出的低光引导转换馈送馈送从增强gan(Egan)发电机的低光图像的特征到降解GAN(DGAN)的发生器。借助真正的弱光图像的信息,DGAN可以在低光图像中综合更逼真的不同照明和对比度。此外,DGAN中的特征随机扰动模块学会了增加特征随机性以产生各种特征分布,从而说服了合成的低光图像以包含逼真的噪声。广泛的实验既证明了所提出的方法的优越性,又证明了每个模块在CIGAN中的有效性。
translated by 谷歌翻译
盲人面部修复(BFR)旨在从低品质的图像中恢复高质量的面部图像,并通常求助于面部先验,以改善恢复性能。但是,当前的方法仍然遇到两个主要困难:1)如何在不进行大规模调整的情况下得出强大的网络体系结构; 2)如何从一个网络中的多个面部先验捕获互补信息以提高恢复性能。为此,我们提出了一个面部修复搜索网络(FRSNET),以适应我们指定的搜索空间内的合适特征提取体系结构,这可以直接有助于恢复质量。在FRSNET的基础上,我们通过多个学习方案进一步设计了多个面部先验搜索网络(MFPSNET)。 MFPSNET最佳地从不同的面部先验中提取信息,并将信息融合到图像特征中,以确保保留外部指导和内部特征。通过这种方式,MFPSNet充分利用了语义级别(解析图),几何级别(面部热图),参考级别(面部词典)和像素级(降级图像)信息,从而产生忠实且逼真的图像。定量和定性实验表明,MFPSNET在合成和现实世界数据集上对最先进的BFR方法表现出色。这些代码可公开可用:https://github.com/yyj1ang/mfpsnet。
translated by 谷歌翻译